
    Weighted Sampled Split Learning (WSSL): Balancing Privacy, Robustness, and Fairness in Distributed Learning Environments

    This study presents Weighted Sampled Split Learning (WSSL), a framework designed to strengthen privacy, robustness, and fairness in distributed machine learning systems. Unlike traditional approaches, WSSL distributes the learning process among multiple clients, thereby safeguarding data confidentiality. Central to WSSL is its use of weighted sampling, which promotes equitable learning by selecting influential clients according to their contributions. We evaluated WSSL across various client configurations on two datasets: Human Gait Sensor and CIFAR-10. We observed three primary benefits: higher model accuracy, improved robustness, and fairness maintained across diverse client compositions. Notably, the distributed framework consistently surpassed its centralized counterpart, reaching peak accuracies of 82.63% and 75.51% on the Human Gait Sensor and CIFAR-10 datasets, respectively, compared with 81.12% and 58.60% for centralized training. Collectively, these findings position WSSL as an effective and scalable alternative to conventional centralized learning for privacy-focused, resilient, and fair distributed machine learning.
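
    The weighted-sampling step can be illustrated with a short sketch. The scheme below, in which a client's selection probability is proportional to a running contribution score, is an assumption for illustration; the paper's exact weighting may differ.

        import numpy as np

        def sample_clients(contribution_scores, num_selected, rng=None):
            """Select clients for the next split-learning round.

            Sampling probability is proportional to each client's contribution
            score (an assumed weighting, used here only to illustrate the idea).
            """
            rng = rng or np.random.default_rng()
            scores = np.asarray(contribution_scores, dtype=float)
            probs = scores / scores.sum()
            return rng.choice(len(scores), size=num_selected, replace=False, p=probs)

        # Example: five clients with hypothetical contribution scores
        scores = [0.9, 0.4, 0.7, 0.2, 0.8]
        selected = sample_clients(scores, num_selected=3)
        print("Clients chosen for this round:", selected)

    In a full WSSL round, the selected clients would then run their local portion of the split model before exchanging activations with the server-side portion.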

    Trust-Based Cloud Machine Learning Model Selection For Industrial IoT and Smart City Services

    With Machine Learning (ML) services now used in a number of mission-critical, human-facing domains, ensuring the integrity and trustworthiness of ML models becomes all-important. In this work, we consider the paradigm where cloud service providers collect big data from resource-constrained devices to build ML-based prediction models, which are then sent back to run locally on those intermittently connected devices. Our proposed solution is an intelligent polynomial-time heuristic that maximizes the trustworthiness of the deployed ML models by selecting, and switching between, a subset of models from a superset, while respecting a given reconfiguration budget/rate and reducing the cloud communication overhead. We evaluate the performance of the proposed heuristic in two case studies. First, we consider Industrial IoT (IIoT) services and, as a proxy for this setting, use the turbofan engine degradation simulation dataset to predict the remaining useful life of an engine. Our results show that the trust level of the selected models is only 0.49% to 3.17% lower than that obtained with Integer Linear Programming (ILP). Second, we consider Smart City services and, as a proxy for this setting, use an experimental transportation dataset to predict the number of cars. Here, the selected models' trust level is 0.7% to 2.53% lower than the ILP results. We also show that the proposed heuristic achieves an optimal competitive ratio in a polynomial-time approximation scheme for the problem.
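
    A minimal sketch of the selection idea follows: at each time step, switch to the model with the highest trust score only while reconfigurations remain in the budget. The trust scores and the greedy rule below are illustrative assumptions, not the paper's exact heuristic.

        def select_models(trust_over_time, reconfig_budget):
            """Greedy sketch: keep the current model unless a higher-trust
            model is available and the reconfiguration budget allows a switch.

            trust_over_time: list of dicts {model_id: trust score}, one per step.
            """
            schedule = []
            current = None
            switches = 0
            for trust in trust_over_time:
                best = max(trust, key=trust.get)
                if current is None:
                    current = best
                elif best != current and switches < reconfig_budget:
                    current = best
                    switches += 1
                schedule.append(current)
            return schedule

        # Hypothetical trust scores for two models over four time steps
        trust_over_time = [
            {"m1": 0.80, "m2": 0.75},
            {"m1": 0.70, "m2": 0.90},
            {"m1": 0.60, "m2": 0.85},
            {"m1": 0.95, "m2": 0.50},
        ]
        print(select_models(trust_over_time, reconfig_budget=1))

    The budget cap is what distinguishes this setting from simply picking the best model at every step: each switch costs a reconfiguration and cloud communication, so the heuristic must decide when a switch is worth spending part of the budget.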

    Parameters Optimization of Deep Learning Models using Particle Swarm Optimization

    Deep learning has been successfully applied in several fields, such as machine translation, manufacturing, and pattern recognition. However, its success depends on setting its parameters appropriately to achieve high-quality results. The number of hidden layers and the number of neurons in each layer are two key parameters with a major influence on performance. Manual parameter setting and grid search ease the user's task of setting these important parameters, but both techniques can be very time consuming. In this paper, we show that Particle Swarm Optimization (PSO) holds great potential for optimizing parameter settings, saving valuable computational resources during the tuning of deep learning models. Specifically, we use a dataset collected from a Wi-Fi campus network to train deep learning models that predict the number of occupants and their locations. Our preliminary experiments indicate that PSO tunes the number of hidden layers and the number of neurons per layer more efficiently than grid search: the exploration of the configuration landscape required to find optimal parameters is reduced by 77%-85%, and PSO also yields better accuracy.
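
    The following sketch shows how PSO can search over the two parameters named in the abstract, the number of hidden layers and neurons per layer. The swarm settings, bounds, and toy_fitness function are hypothetical stand-ins; in the study, fitness would be the validation performance of a model trained with the candidate configuration.

        import random

        def pso_tune(fitness, bounds, n_particles=10, n_iters=20, w=0.7, c1=1.5, c2=1.5):
            """Minimal PSO sketch over (num_hidden_layers, neurons_per_layer).

            `fitness` returns a score to maximize; `bounds` is [(low, high), ...]
            per dimension. Illustrative only, not the paper's exact configuration.
            """
            dim = len(bounds)
            pos = [[random.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
            vel = [[0.0] * dim for _ in range(n_particles)]
            pbest = [p[:] for p in pos]
            pbest_val = [fitness(p) for p in pos]
            g = max(range(n_particles), key=lambda i: pbest_val[i])
            gbest_pos, gbest_val = pbest[g][:], pbest_val[g]

            for _ in range(n_iters):
                for i in range(n_particles):
                    for d in range(dim):
                        r1, r2 = random.random(), random.random()
                        # Velocity update: inertia + pull toward personal and global bests
                        vel[i][d] = (w * vel[i][d]
                                     + c1 * r1 * (pbest[i][d] - pos[i][d])
                                     + c2 * r2 * (gbest_pos[d] - pos[i][d]))
                        lo, hi = bounds[d]
                        pos[i][d] = min(max(pos[i][d] + vel[i][d], lo), hi)
                    val = fitness(pos[i])
                    if val > pbest_val[i]:
                        pbest[i], pbest_val[i] = pos[i][:], val
                        if val > gbest_val:
                            gbest_pos, gbest_val = pos[i][:], val
            return [round(x) for x in gbest_pos], gbest_val

        # Hypothetical fitness: stand-in for validation accuracy of a model
        # trained with the candidate (layers, neurons) setting; peaks at (3, 64).
        def toy_fitness(p):
            layers, neurons = round(p[0]), round(p[1])
            return -((layers - 3) ** 2 + (neurons - 64) ** 2)

        best, score = pso_tune(toy_fitness, bounds=[(1, 6), (8, 128)])
        print("Best (layers, neurons):", best)

    Because each particle evaluates one configuration per iteration, the number of model trainings is bounded by n_particles times n_iters, which is how PSO avoids the exhaustive sweep that grid search requires.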

    CCTFv1: Computational Modeling of Cyber Team Formation Strategies

    Cybersecurity is rooted in collaborative effort, spanning cyber competitions and cyber warfare. Yet, despite extensive research into team strategy in sports and project management, empirical study of teaming in cybersecurity is minimal. This gap motivates this paper, which presents the Collaborative Cyber Team Formation (CCTF) Simulation Framework. Using agent-based modeling, we examine the dynamics of team creation and team output, focusing on the impact of structural dynamics on performance while carefully controlling other variables. Our findings highlight the importance of strategic team formation, an aspect often overlooked in corporate cybersecurity and cyber competition teams.
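
    As a rough illustration of how formation strategy alone can shift simulated team output, the sketch below compares two hypothetical strategies over agents reduced to a single skill score. The agent model, strategies, and performance proxy are assumptions for illustration and are far simpler than the CCTF framework's agents.

        import random
        import statistics

        def form_teams(agents, team_size, strategy="random"):
            """Form teams of agents under a given (hypothetical) strategy."""
            pool = agents[:]
            if strategy == "random":
                random.shuffle(pool)
            elif strategy == "skill_sorted":
                pool.sort(reverse=True)  # stacks the strongest agents together
            return [pool[i:i + team_size] for i in range(0, len(pool), team_size)]

        def team_performance(team):
            # Simple proxy: mean skill, penalized by within-team skill spread
            return statistics.mean(team) - 0.5 * statistics.pstdev(team)

        agents = [random.random() for _ in range(20)]
        for strategy in ("random", "skill_sorted"):
            teams = form_teams(agents, team_size=4, strategy=strategy)
            avg = statistics.mean(team_performance(t) for t in teams)
            print(strategy, round(avg, 3))

    Holding the agent population fixed and varying only the formation rule is the same control logic the framework uses to isolate structural effects on performance.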

    Machine Learning-Based Peripheral Artery Disease Identification Using Laboratory-Based Gait Data

    Peripheral artery disease (PAD) results from atherosclerosis, which limits blood flow to the legs and causes changes in muscle structure and function and in gait performance. PAD is underdiagnosed, which delays treatment and worsens clinical outcomes. To address this challenge, the purpose of this study is to develop machine learning (ML) models that distinguish individuals with and without PAD, a first step toward using ML to identify those at risk of PAD early. We built ML models from previously acquired overground walking biomechanics data from patients with PAD and healthy controls. Gait signatures were characterized using ankle, knee, and hip joint angles, torques, and powers, as well as ground reaction forces (GRF). Neural Network and Random Forest models classified those with and without PAD with 89% accuracy (0.64 Matthews correlation coefficient) using all laboratory-based gait variables, and models using only GRF variables reached up to 87% accuracy (0.64 Matthews correlation coefficient). These results indicate that ML models can classify those with and without PAD from gait signatures with acceptable performance, and that a gait signature model using GRF features delivers the most informative data for PAD classification.
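
    A minimal sketch of this classification setup is shown below: a Random Forest trained on a feature matrix and scored with accuracy and the Matthews correlation coefficient. The synthetic features and labels are placeholders standing in for the laboratory GRF and joint variables, not the study's data.

        import numpy as np
        from sklearn.ensemble import RandomForestClassifier
        from sklearn.model_selection import train_test_split
        from sklearn.metrics import accuracy_score, matthews_corrcoef

        rng = np.random.default_rng(0)
        X = rng.normal(size=(200, 12))                  # placeholder gait features
        y = (X[:, 0] + 0.5 * X[:, 3] > 0).astype(int)   # placeholder PAD labels

        X_train, X_test, y_train, y_test = train_test_split(
            X, y, test_size=0.3, random_state=0)
        clf = RandomForestClassifier(n_estimators=200, random_state=0)
        clf.fit(X_train, y_train)
        pred = clf.predict(X_test)
        print("accuracy:", accuracy_score(y_test, pred))
        print("MCC:", matthews_corrcoef(y_test, pred))

    Reporting MCC alongside accuracy, as the study does, guards against the inflated accuracy a classifier can achieve on imbalanced patient/control groups.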